Deep Spoken Keyword Spotting: An Overview
Authors
Abstract
Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams and has become a fast-growing technology thanks to the paradigm shift introduced by deep learning a few years ago. This has allowed the rapid embedding of deep KWS in a myriad of small electronic devices with different purposes like the activation of voice assistants. Prospects suggest a sustained growth in terms of social use of this technology. Thus, it is not surprising that deep KWS has become a hot research topic among speech scientists, who constantly look for KWS performance improvement and computational complexity reduction. This context motivates this paper, in which we conduct a literature review into deep spoken KWS to assist practitioners and researchers who are interested in this technology. Specifically, this overview has a comprehensive nature by covering a thorough analysis of deep KWS systems (which includes speech features, acoustic modeling and posterior handling), robustness methods, applications, datasets, evaluation metrics, performance of deep KWS systems and audio-visual KWS. The analysis performed in this paper allows us to identify a number of directions for future research, including directions adopted from automatic speech recognition research and directions that are unique to the problem of spoken KWS.
Related resources
Spoken keyword spotting via multi-lattice alignment
We propose a method for finding keywords in an audio database using a spoken query. Our method is based on performing a joint alignment between a phone lattice generated from a spoken utterance query and a second phone lattice representing a long utterance needing to be searched. We implement this joint alignment procedure in a graphical models framework. We evaluate our system on TIMIT as well...
Transferable Deep Features for Keyword Spotting
Deep features, defined as the activations of hidden layers of a neural network, have given promising results applied to various vision tasks. In this paper, we explore the usefulness and transferability of deep features, applied in the context of the problem of keyword spotting (KWS). We use a state-of-the-art deep convolutional network to extract deep features. The optimal parameters concerning...
Unsupervised Spoken Keyword Spotting and Learning of Acoustically Meaningful Units
The problem of keyword spotting in audio data has been explored for many years. Typically researchers use supervised methods to train statistical models to detect keyword instances. However, such supervised methods require large quantities of annotated data that is unlikely to be available for the majority of languages in the world. This thesis addresses this lack-of-annotation problem and pres...
Deep Residual Learning for Small-Footprint Keyword Spotting
We explore the application of deep residual learning and dilated convolutions to the keyword spotting task, using the recently-released Google Speech Commands Dataset as our benchmark. Our best residual network (ResNet) implementation significantly outperforms Google’s previous convolutional neural networks in terms of accuracy. By varying model depth and width, we can achieve compact models th...
Discriminative keyword spotting
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve, as this quantity is the most common measure to evaluate keyword spotters. The keyword spotter we devise is based on nonlinearly mapping the input acoustic representat...
Journal
Journal title: IEEE Access
Year: 2022
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2021.3139508